A Transformer-Based Approach to Authorship Attribution in Classical Arabic Texts
Abstract
Authorship attribution (AA) is a field of natural language processing that aims to attribute a text to its author. Although the literature includes several studies on Arabic AA in general, applying AA to classical Arabic texts has not gained similar attention. This study focuses on investigating recent Arabic pretrained transformer-based models in a rarely studied domain with limited research contributions: Islamic law. We adopt an experimental approach to investigate AA. Because no dataset has been designed specifically for this task, we design and build our own dataset using Islamic law digital resources. We conduct experiments on fine-tuning four Arabic pretrained transformer-based models: AraBERT, AraELECTRA, ARBERT, and MARBERT. Results indicate that for the task of attributing a given text to its author, ARBERT and AraELECTRA outperform the other models, with an accuracy of 96%. We conclude that pretrained transformer-based models, fine-tuned on the Islamic legal dataset, show significant results in applying AA to Islamic legal texts.
Similar resources
Authorship Attribution of Texts: A Review
We study the authorship attribution of documents given some prior stylistic characteristics of the author’s writing extracted from a corpus of known works, e.g., authentication of disputed documents or literary works. Although the pioneering paper based on word length histograms appeared at the very end of the nineteenth century, the resolution power of this and other stylometry approaches is y...
A Web-Based Self-training Approach for Authorship Attribution
Like any other text categorization task, authorship attribution requires a large number of training examples. These examples, which are easily obtained for most tasks, are particularly difficult to obtain in this case. Based on this fact, in this paper we investigate the possibility of using Web-based text mining methods for the identification of the author of a given poem. In particular, ...
Authorship Attribution and Author Profiling of Lithuanian Literary Texts
In this work we address the authorship attribution and author profiling tasks (focusing on the age and gender dimensions) for the Lithuanian language. This paper reports the first results on literary texts, which we compare to results previously obtained with different functional styles and language types (i.e., parliamentary transcripts and forum posts). Using the Naïve Bayes Multinom...
Convolutional Neural Networks for Authorship Attribution of Short Texts
We present a model to perform authorship attribution of tweets using Convolutional Neural Networks (CNNs) over character n-grams. We also present a strategy that improves model interpretability by estimating the importance of input text fragments in the predicted classification. The experimental evaluation shows that text CNNs perform competitively and are able to outperform previous methods.
Authorship Attribution for Small Texts: Literary and Forensic Experiments
Journal
Journal title: Applied Sciences
Year: 2023
ISSN: 2076-3417
DOI: https://doi.org/10.3390/app13127255